Corpus-based part-of-speech disambiguation of Persian
In this paper we introduce a method for part-of-speech disambiguation of Persian texts, which uses word class probabilities estimated from a relatively small training corpus to automatically tag unrestricted Persian texts. The experiment was carried out at two levels: unigram and bigram genotype disambiguation. Comparing the results from the two levels, we show that using the immediate right context of a given word can substantially increase the accuracy of the system.
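To make the two levels concrete, the following is a minimal sketch rather than the system described here. It assumes that a genotype is the set of tags a word can take according to a lexicon, that the unigram level picks the tag most often seen for a genotype in the training corpus, and that the bigram level conditions this choice on the genotype of the word immediately to the right. The function names, the `END` marker, the placeholder tokens `a`, `b`, `c`, and the toy tag set are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical marker for the right context of the last word in a sentence.
END = frozenset({"<END>"})


def train(tagged_sentences, lexicon):
    """Count which tag resolves each genotype, alone and given the right neighbour's genotype."""
    unigram = defaultdict(Counter)  # genotype -> tag counts
    bigram = defaultdict(Counter)   # (genotype, right genotype) -> tag counts
    for sentence in tagged_sentences:
        genotypes = [lexicon[word] for word, _ in sentence]
        for i, (_, gold_tag) in enumerate(sentence):
            right = genotypes[i + 1] if i + 1 < len(genotypes) else END
            unigram[genotypes[i]][gold_tag] += 1
            bigram[(genotypes[i], right)][gold_tag] += 1
    return unigram, bigram


def tag_words(words, lexicon, unigram, bigram, use_right_context=True):
    """Pick the most frequent tag per word, backing off from bigram to unigram counts."""
    genotypes = [lexicon[word] for word in words]
    tags = []
    for i, genotype in enumerate(genotypes):
        right = genotypes[i + 1] if i + 1 < len(genotypes) else END
        counts = bigram.get((genotype, right)) if use_right_context else None
        if not counts:
            counts = unigram.get(genotype)
        if counts:
            tags.append(counts.most_common(1)[0][0])
        else:
            # Genotype never seen in training: fall back to an arbitrary but stable choice.
            tags.append(sorted(genotype)[0])
    return tags


# Toy lexicon and corpus with placeholder tokens (assumed data, not Persian).
lexicon = {
    "a": frozenset({"N", "V"}),  # ambiguous word
    "b": frozenset({"N"}),
    "c": frozenset({"V"}),
}
corpus = [
    [("a", "N"), ("c", "V")],
    [("a", "N"), ("c", "V")],
    [("a", "V"), ("b", "N")],
]
unigram, bigram = train(corpus, lexicon)

unigram_tags = tag_words(["a", "b"], lexicon, unigram, bigram, use_right_context=False)
bigram_tags = tag_words(["a", "b"], lexicon, unigram, bigram, use_right_context=True)
```

In this toy corpus the ambiguous token `a` is tagged `N` at the unigram level, because `N` is its most frequent tag overall, but `V` at the bigram level, because `V` is the only tag observed when its right neighbour has the genotype `{N}`.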